Optimized Preprocessing for Accurate and Efficient Bioassay Prediction with Machine Learning Algorithms
نویسندگان
چکیده
Bioassay is the measurement of the potency of a chemical substance by its effect on a living animal or plant tissue. Bioassay data and chemical structures from pharmacokinetic and drug metabolism screening are mined from and housed in multiple databases. Bioassay prediction is calculated accordingly to determine further advancement. This paper proposes a four-step preprocessing of datasets for improving the bioassay predictions. The first step is instance selection in which dataset is categorized into training, testing, and validation sets. The second step is discretization that partitions the data in consideration of accuracy vs. precision. The third step is normalization where data are normalized between 0 and 1 for subsequent machine learning processing. The fourth step is feature selection where key chemical properties and attributes are generated. The streamlined results are then analyzed for the prediction of effectiveness by various machine learning algorithms including Pipeline Pilot, R, Weka, and Excel. Experiments and evaluations reveal the effectiveness of various combination of preprocessing steps and machine learning algorithms in more consistent and accurate prediction. Keywords—Bioassay, machine learning, preprocessing, virtual screen.
منابع مشابه
Prediction of Breast Tumor Malignancy Using Neural Network and Whale Optimization Algorithms (WOA)
Introduction: Breast cancer is the most prevalent cause of cancer mortality among women. Early diagnosis of breast cancer gives patients greater survival time. The present study aims to provide an algorithm for more accurate prediction and more effective decision-making in the treatment of patients with breast cancer. Methods: The present study was applied, descriptive-analytical, based on the ...
متن کاملMachine learning algorithms for time series in financial markets
This research is related to the usefulness of different machine learning methods in forecasting time series on financial markets. The main issue in this field is that economic managers and scientific society are still longing for more accurate forecasting algorithms. Fulfilling this request leads to an increase in forecasting quality and, therefore, more profitability and efficiency. In this pa...
متن کاملAccuracy Improvement of Mood Disorders Prediction using a Combination of Data Mining and Meta-Heuristic Algorithms
Introduction: Since the delay or mistake in the diagnosis of mood disorders due to the similarity of their symptoms hinders effective treatment, this study aimed to accurately diagnose mood disorders including psychosis, autism, personality disorder, bipolar, depression, and schizophrenia, through modeling and analyzing patients' data. Method: Data collected in this applied developmental resear...
متن کاملAccuracy Improvement of Mood Disorders Prediction using a Combination of Data Mining and Meta-Heuristic Algorithms
Introduction: Since the delay or mistake in the diagnosis of mood disorders due to the similarity of their symptoms hinders effective treatment, this study aimed to accurately diagnose mood disorders including psychosis, autism, personality disorder, bipolar, depression, and schizophrenia, through modeling and analyzing patients' data. Method: Data collected in this applied developmental resear...
متن کاملMachine learning algorithms in air quality modeling
Modern studies in the field of environment science and engineering show that deterministic models struggle to capture the relationship between the concentration of atmospheric pollutants and their emission sources. The recent advances in statistical modeling based on machine learning approaches have emerged as solution to tackle these issues. It is a fact that, input variable type largely affec...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2017